Modulation spectrum-constrained trajectory training algorithm for HMM-based speech synthesis
Abstract
This paper presents a novel training algorithm for Hidden Markov Model (HMM)-based speech synthesis. One of the biggest causes of quality degradation in synthetic speech is the over-smoothing effect often observed in generated speech parameter trajectories. We recently found that the Modulation Spectrum (MS) of the generated speech parameters is closely correlated with the over-smoothing effect, and proposed a parameter generation algorithm that considers the MS. That algorithm effectively alleviates the over-smoothing effect, but it loses the computationally efficient generation processing of the conventional generation algorithm. In this paper, the MS is integrated into the training stage instead of the parameter generation stage, in a manner similar to our previous work on Gaussian Mixture Model (GMM)-based spectral parameter trajectory conversion. The trajectory HMM is trained with a novel objective function consisting of both the conventional trajectory HMM likelihood and a newly introduced MS likelihood. This training framework is further extended to the F0 component. The experimental results demonstrate that the proposed algorithm improves synthetic speech quality while preserving computationally efficient generation processing.
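The link between over-smoothing and the MS can be illustrated with a minimal sketch. It assumes the MS is taken as the log-scaled power spectrum of a single parameter trajectory along the time axis; the function names, FFT length, mean removal, and synthetic trajectory are illustrative choices, not taken from the paper. A moving average stands in for over-smoothing here; it is not the HMM parameter generation algorithm.

```python
import numpy as np


def modulation_spectrum(traj, n_fft=256):
    """Log-scaled power spectrum of one parameter trajectory over time.

    Assumption: mean removal and a small floor inside the log are
    illustrative choices, not details given in the abstract.
    """
    traj = np.asarray(traj, dtype=float)
    spec = np.fft.rfft(traj - traj.mean(), n=n_fft)
    return np.log(np.abs(spec) ** 2 + 1e-10)


def moving_average(traj, width=9):
    """Crude stand-in for over-smoothing: edge-padded moving average."""
    padded = np.pad(traj, width // 2, mode="edge")
    kernel = np.ones(width) / width
    return np.convolve(padded, kernel, mode="valid")


# Synthetic "natural" trajectory: slow movement plus fine temporal detail.
rng = np.random.default_rng(0)
frames = 200
t = np.arange(frames)
natural = np.sin(2 * np.pi * t / 50) + rng.normal(scale=0.3, size=frames)
smoothed = moving_average(natural)

ms_nat = modulation_spectrum(natural)
ms_smooth = modulation_spectrum(smoothed)

# Compare the upper half of the modulation-frequency axis.
high = slice(len(ms_nat) // 2, None)
gap = ms_nat[high].mean() - ms_smooth[high].mean()
print(f"mean log-power gap at high modulation frequencies: {gap:.2f}")
```

In this toy setting the smoothed trajectory has lower MS values at high modulation frequencies than the original, which is the kind of gap an MS likelihood term in the training objective would penalize.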
Similar resources
An introduction of trajectory model into HMM-based speech synthesis
In the synthesis part of a hidden Markov model (HMM) based speech synthesis system which we have proposed, a speech parameter vector sequence is generated from a sentence HMM corresponding to an arbitrarily given text by using a speech parameter generation algorithm. However, there is an inconsistency: although the speech parameter vector sequence is generated under the constraints between stat...
Speech Parameter Sequence Modeling with Latent Trajectory Hidden Markov Model
The weakness of hidden Markov models (HMMs) is that they have difficulty in modeling and capturing the local dynamics of feature sequences due to the piecewise stationarity assumption and the conditional independence assumption on feature sequences. Traditionally, in speech recognition systems, this limitation has been circumvented by appending dynamic (delta and delta-delta) components to the ...
Reformulating the HMM as a trajectory model by imposing explicit relationships between static and dynamic feature vector sequences
In the present paper, a trajectory model, derived from the hidden Markov model (HMM) by imposing explicit relationships between static and dynamic feature vector sequences, is developed and evaluated. The derived model, named trajectory HMM, can alleviate some limitations of the standard HMM, which are i) piece-wise constant statistics within a state and ii) conditional independence assumption ...
Speech trajectory discrimination using the minimum classification error learning
In this paper, we extend the maximum likelihood (ML) training algorithm to the minimum classification error (MCE) training algorithm for discriminatively estimating the state-dependent polynomial coefficients in the stochastic trajectory model or the trended hidden Markov model (HMM) originally proposed in [2]. The main motivation of this extension is the new model space for smoothness-constrai...
Speech enhancement based on hidden Markov model using sparse code shrinkage
This paper presents a new hidden Markov model-based (HMM-based) speech enhancement framework based on independent component analysis (ICA). We propose analytical procedures for training clean speech and noise models by the Baum re-estimation algorithm and present a maximum a posteriori (MAP) estimator based on Laplace-Gaussian (for clean speech and noise respectively) combination in the HMM ...